Josh Wills
Author
Publisher
O'REILLY MEDIA
Pub. Date
2022
Language
English
Description
The amount of data being generated today is staggering and growing. Apache Spark has emerged as the de facto tool to analyze big data and is now a critical part of the data science toolbox. Updated for Spark 3.0, this practical guide brings together Spark, statistical methods, and real-world datasets to teach you how to approach analytics problems using PySpark, Spark's Python API, and other best practices in Spark programming. Data scientists Akash...
Author
Publisher
O'Reilly
Pub. Date
2017
Language
English
Description
The authors bring Spark, statistical methods, and real-world data sets together to teach you how to approach analytics problems by presenting examples and a set of self-contained patterns for performing large-scale data analysis with Spark. You'll start with an introduction to Spark and its ecosystem, and then dive into patterns that apply common techniques (classification, collaborative filtering, and anomaly detection, among others) to fields such...